Integrated circuit and method for establishing transactions

ABSTRACT

An integrated circuit comprising a plurality of processing modules (M; I; S; T) and a network (N; RN) arranged for providing at least one connection between a first and at least one second module is provided. Said connection comprises a set of communication channels each having a set of connection properties. Said connection supports transactions comprising outgoing messages from the first module to the second module and return messages from the second module to the first module. The connection properties of the different communication channels of said connection can be adjusted independently. Therefore, the resources of a network on chip are utilized more efficiently, since the connection between modules can be adapted to their actual requirements, such that the connection is not over-dimensioned and unused network resources can be assigned to other connections.

FIELD OF THE INVENTION

The invention relates to an integrated circuit having a plurality of processing modules and a network arranged for providing connections between processing modules and a method for exchanging messages in such an integrated circuit.

BACKGROUND OF THE INVENTION

Systems on silicon show a continuous increase in complexity due to the ever increasing need for implementing new features and improvements of existing functions. This is enabled by the increasing density with which components can be integrated on an integrated circuit. At the same time the clock speed at which circuits are operated tends to increase too. The higher clock speed in combination with the increased density of components has reduced the area which can operate synchronously within the same clock domain. This has created the need for a modular approach. According to such an approach the processing system comprises a plurality of relatively independent, complex modules. In conventional processing systems the system modules usually communicate with each other via a bus. As the number of modules increases, however, this way of communication is no longer practical for the following reasons. On the one hand, the large number of modules imposes too high a bus load. On the other hand, the bus forms a communication bottleneck as it enables only one device at a time to send data to the bus. A communication network forms an effective way to overcome these disadvantages.

Networks on chip (NoC) have received considerable attention recently as a solution to the interconnect problem in highly-complex chips. The reason is twofold. First, NoCs help resolve the electrical problems in new deep-submicron technologies, as they structure and manage global wires. At the same time they share wires, lowering their number and increasing their utilization. NoCs can also be energy efficient and reliable and are scalable compared to buses. Second, NoCs also decouple computation from communication, which is essential in managing the design of billion-transistor chips. NoCs achieve this decoupling because they are traditionally designed using protocol stacks, which provide well-defined interfaces separating communication service usage from service implementation.

Using networks for on-chip communication when designing systems on chip (SoC), however, raises a number of new issues that must be taken into account. This is because, in contrast to existing on-chip interconnects (e.g., buses, switches, or point-to-point wires), where the communicating modules are directly connected, in a NoC the modules communicate remotely via network nodes. As a result, interconnect arbitration changes from centralized to distributed, and issues like out-of-order transactions, higher latencies, and end-to-end flow control must be handled either by the intellectual property block (IP) or by the network.

Most of these topics have already been the subject of research in the field of local and wide area networks (computer networks) and in the field of interconnect networks for parallel machines. Both are very much related to on-chip networks, and many of the results in those fields are also applicable on chip. However, the premises of NoCs are different from those of off-chip networks, and, therefore, most of the network design choices must be reevaluated. On-chip networks have different properties (e.g., tighter link synchronization) and constraints (e.g., higher memory cost) leading to different design choices, which ultimately affect the network services.

NoCs differ from off-chip networks mainly in their constraints and synchronization. Typically, resource constraints are tighter on chip than off chip. Storage (i.e., memory) and computation resources are relatively more expensive, whereas the number of point-to-point links is larger on chip than off chip. Storage is expensive, because general-purpose on-chip memory, such as RAM, occupies a large area. Having the memory distributed in the network components in relatively small sizes is even worse, as the overhead area in the memory then becomes dominant.

For on-chip networks computation too comes at a relatively high cost compared to off-chip networks. An off-chip network interface usually contains a dedicated processor to implement the protocol stack up to network layer or even higher, to relieve the host processor from the communication processing. Including a dedicated processor in a network interface is not feasible on chip, as the size of the network interface will become comparable to or larger than the IP to be connected to the network. Moreover, running the protocol stack on the IP itself may also not be feasible, because often these IPs have one dedicated function only, and do not have the capabilities to run a network protocol stack.

The number of wires and pins to connect network components is an order of magnitude larger on chip than off chip. If they are not used massively for purposes other than NoC communication, they allow wide point-to-point interconnects (e.g., 300-bit links). This is not possible off chip, where links are relatively narrow: 8-16 bits.

On-chip wires are also relatively shorter than off-chip wires, allowing a much tighter synchronization than off chip. This allows a reduction in the buffer space in the routers because the communication can be done at a smaller granularity. In the current semiconductor technologies, wires are also fast and reliable, which allows simpler link-layer protocols (e.g., no need for error correction, or retransmission). This also compensates for the lack of memory and computational resources.

Reliable communication: A consequence of the tight on-chip resource constraints is that the network components (i.e., routers and network interfaces) must be fairly simple to minimize computation and memory requirements. Luckily, on-chip wires provide a reliable communication medium, which can help to avoid the considerable overhead incurred by off-chip networks for providing reliable communication. Data integrity can be provided at low cost at the data link layer. However, data loss also depends on the network architecture, as in most computer networks data is simply dropped if congestion occurs in the network.

Deadlock: Computer network topologies generally have an irregular (possibly dynamic) structure, which can introduce buffer cycles. Deadlock can also be avoided, for example, by introducing constraints either in the topology or routing. Fat-tree topologies have already been considered for NoCs, where deadlock is avoided by bouncing back packets in the network in case of buffer overflow. Tile-based approaches to system design use mesh or torus network topologies, where deadlock can be avoided using, for example, a turn-model routing algorithm. Deadlock is mainly caused by cycles in the buffers. To avoid deadlock, routing must be cycle-free, because of its lower cost in achieving reliable communication. A second cause of deadlock is atomic chains of transactions. The reason is that while a module is locked, the queues storing transactions may get filled with transactions outside the atomic transaction chain, blocking the transactions in the chain from reaching the locked module. If atomic transaction chains must be implemented (to be compatible with processors allowing this, such as MIPS), the network nodes should be able to filter the transactions in the atomic chain.

Data ordering: In a network, data sent from a source to a destination may arrive out of order due to reordering in network nodes, following different routes, or retransmission after dropping. For off-chip networks out-of-order data delivery is typical. However, for NoCs where no data is dropped, data can be forced to follow the same path between a source and a destination (deterministic routing) with no reordering. This in-order data transportation requires less buffer space, and reordering modules are no longer necessary.

Network flow control and buffering strategy: Network flow control and buffering strategy have a direct impact on the memory utilization in the network. Wormhole routing requires only a flit buffer (per queue) in the router, whereas store-and-forward and virtual-cut-through routing require at least the buffer space to accommodate a packet. Consequently, on chip, wormhole routing may be preferred over virtual-cut-through or store-and-forward routing. Similarly, input queuing may be a lower memory-cost alternative to virtual-output-queuing or output-queuing buffering strategies, because it has fewer queues. Dedicated (lower cost) FIFO memory structures also enable on-chip usage of virtual-cut-through routing or virtual output queuing for a better performance. However, using virtual-cut-through routing and virtual output queuing at the same time is still too costly.

Time-related guarantees: Off-chip networks typically use packet switching and offer best-effort services. Contention can occur at each network node, making latency guarantees very hard to offer. Throughput guarantees can still be offered using schemes such as rate-based switching or deadline-based packet switching, but with high buffering costs. An alternative to provide such time-related guarantees is to use time-division multiple access (TDMA) circuits, where every circuit is dedicated to a network connection. Circuits provide guarantees at a relatively low memory and computation cost. Network resource utilization is increased when the network architecture allows any left-over guaranteed bandwidth to be used by best-effort communication.

Introducing networks as on-chip interconnects radically changes the communication when compared to direct interconnects, such as buses or switches. This is because of the multi-hop nature of a network, where communication modules are not directly connected, but separated by one or more network nodes. This is in contrast with the prevalent existing interconnects (i.e., buses) where modules are directly connected. The implications of this change reside in the arbitration (which must change from centralized to distributed), and in the communication properties (e.g., ordering, or flow control).

An outline of the differences between NoCs and buses will be given below. We refer mainly to buses as direct interconnects, because currently they are the most used on-chip interconnect. Most of the bus characteristics also hold for other direct interconnects (e.g., switches). Multilevel buses are a hybrid between buses and NoCs. For our purposes, depending on the functionality of the bridges, multilevel buses either behave like simple buses or like NoCs. The programming model of a bus typically consists of load and store operations which are implemented as a sequence of primitive bus transactions. Bus interfaces typically have dedicated groups of wires for command, address, write data, and read data. A bus is a resource shared by multiple IPs. Therefore, before using it, IPs must go through an arbitration phase, where they request access to the bus, and block until the bus is granted to them.

A bus transaction involves a request and possibly a response. Modules issuing requests are called masters, and those serving requests are called slaves. If there is a single arbitration for a request-response pair, the bus is called non-split. In this case, the bus remains allocated to the master of the transaction until the response is delivered, even when this takes a long time. Alternatively, in a split bus, the bus is released after the request to allow transactions from different masters to be initiated. However, a new arbitration must be performed for the response such that the slave can access the bus.

For both split and non-split buses, both communication parties have direct and immediate access to the status of the transaction. In contrast, network transactions are one-way transfers from an output buffer at the source to an input buffer at the destination that cause some action at the destination, the occurrence of which is not visible at the source. The effects of a network transaction are observable only through additional transactions. A request-response type of operation is still possible, but requires at least two distinct network transactions. Thus, a bus-like transaction in a NoC will essentially be a split transaction.

Transaction Ordering: Traditionally, on a bus all transactions are ordered (cf. Peripheral VCI, AMBA, or CoreConnect PLB and OPB). This is possible at a low cost, because the interconnect, being a direct link between the communicating parties, does not reorder data. However, on a split bus, a total ordering of transactions on a single master may still cause performance penalties, when slaves respond at different speeds. To solve this problem, recent extensions to bus protocols allow transactions to be performed on connections. Ordering of transactions within a connection is still preserved, but between connections there are no ordering constraints (e.g., OCP, or Basic VCI). A few of the bus protocols allow out-of-order responses per connection in their advanced modes (e.g., Advanced VCI), but both requests and responses arrive at the destination in the same order as they were sent.

In a NoC, ordering becomes weaker. Global ordering can only be provided at a very high cost due to the conflict between the distributed nature of the networks, and the requirement of a centralized arbitration necessary for global ordering. Even local ordering, between a source-destination pair, may be costly. Data may arrive out of order if it is transported over multiple routes. In such cases, to still achieve an in-order delivery, data must be labeled with sequence numbers and reordered at the destination before being delivered. The communication network comprises a plurality of partly connected nodes. Messages from a module are redirected by the nodes to one or more other nodes. To that end the message comprises first information indicative of the location of the addressed module(s) within the network. The message may further include second information indicative of a particular location within the module, such as a memory or register address. The second information may invoke a particular response of the addressed module.
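The sequence-number reordering mentioned above can be sketched as follows. This is a minimal illustration, not part of the described circuit: the class name `ReorderBuffer`, the method `receive`, and the assumption that sequence numbers start at 0 and never wrap are all hypothetical.

```python
class ReorderBuffer:
    """Destination-side buffer that releases data in sequence-number order.

    Hypothetical sketch: sequence numbers are assumed to start at 0 and
    to increase by one per packet without wrapping.
    """

    def __init__(self):
        self.next_seq = 0   # next sequence number that may be delivered
        self.pending = {}   # early arrivals, keyed by sequence number

    def receive(self, seq, data):
        """Store one arriving packet and return the payloads that can now
        be delivered in order (possibly an empty list)."""
        self.pending[seq] = data
        delivered = []
        # Drain every packet that has become contiguous with what was
        # already delivered.
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered
```

A packet that arrives early is held until its predecessors arrive, which is why multi-route transport costs extra buffer space at the destination.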

Atomic Chains of Transactions: An atomic chain of transactions is a sequence of transactions initiated by a single master that is executed on a single slave exclusively. That is, other masters are denied access to that slave, once the first transaction in the chain has claimed it. This mechanism is widely used to implement synchronization mechanisms between master modules (e.g., semaphores). On a bus, atomic operations can easily be implemented, as the central arbiter will either (a) lock the bus for exclusive use by the master requesting the atomic chain, or (b) know not to grant access to a locked slave. In the former case, the time for which resources are locked is shorter because once a master has been granted access to a bus, it can quickly perform all the transactions in the chain (no arbitration delay is required for the subsequent transactions in the chain). Consequently, the locked slave and the bus can be opened up again in a short time. This approach is used in AMBA and CoreConnect. In the latter case, the bus is not locked, and can still be used by other modules, however, at the price of a longer locking time of the slave. This approach is used in VCI and OCP.

In a NoC, where the arbitration is distributed, masters do not know that a slave is locked. Therefore, transactions to a locked slave may still be initiated, even though the locked slave cannot accept them. Consequently, to prevent deadlock, the transactions in the atomic chain must be able to bypass these other transactions to be served. Moreover, the time a module is locked is much longer in case of NoCs, because of the higher latency per transaction.

Media Arbitration: An important difference between buses and NoCs is in the medium arbitration scheme. In a bus, master modules request access to the interconnect, and the arbiter grants the access for the whole interconnect at once. Arbitration is centralized as there is only one arbiter component, and global as all the requests as well as the state of the interconnect are visible to the arbiter. Moreover, when a grant is given, the complete path from the source to the destination is exclusively reserved. In a non-split bus, arbitration takes place once when a transaction is initiated. As a result, the bus is granted for both request and response. In a split bus, requests and responses are arbitrated separately.

In a NoC arbitration is also necessary, as it is a shared interconnect. However, in contrast to buses, the arbitration is distributed, because it is performed in every router, and is based only on local information. Arbitration of the communication resources (links, buffers) is performed incrementally as the request or response advances.

Destination Name and Routing: For a bus, the command, address, and data are broadcast on the interconnect. They arrive at every destination, of which one activates based on the broadcast address, and executes the requested command. This is possible because all modules are directly connected to the same bus. In a NoC, it is not feasible to broadcast information to all destinations, because it would have to be copied to all routers and network interfaces, flooding the network with data. The address is better decoded at the source to find a route to the destination module. A transaction address will therefore have two parts: (a) a destination identifier, and (b) an internal address at the destination.
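As an illustration of the two-part transaction address, the following sketch splits a flat address into a destination identifier and an internal address at the source. The field widths `DEST_BITS` and `OFFSET_BITS` and the function names are assumptions chosen for the example; the text does not specify an address layout.

```python
# Assumed layout (hypothetical): the upper 8 bits select the destination
# module, the lower 24 bits are the internal address at that destination.
DEST_BITS = 8
OFFSET_BITS = 24


def encode_address(dest_id, internal):
    """Pack a destination identifier and an internal address into one word."""
    if not 0 <= dest_id < (1 << DEST_BITS):
        raise ValueError("destination identifier out of range")
    if not 0 <= internal < (1 << OFFSET_BITS):
        raise ValueError("internal address out of range")
    return (dest_id << OFFSET_BITS) | internal


def decode_address(address):
    """Split a transaction address into (destination id, internal address)."""
    dest_id = address >> OFFSET_BITS
    internal = address & ((1 << OFFSET_BITS) - 1)
    return dest_id, internal
```

The source would use the destination identifier to look up a route, while only the internal address needs to be interpreted by the addressed module.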

Latency: Transaction latency is caused by two factors: (a) the access time to the bus, which is the time until the bus is granted, and (b) the latency introduced by the interconnect to transfer the data. For a bus, where the arbitration is centralized, the access time is proportional to the number of masters connected to the bus. The transfer latency itself typically is constant and relatively fast, because the modules are linked directly. However, the speed of transfer is limited by the bus speed, which is relatively low.

In a NoC, arbitration is performed at each router for the following link. The access time per router is small. Both end-to-end access time and transport time increase proportionally to the number of hops between master and slave. However, network links are unidirectional and point to point, and hence can run at higher frequencies than buses, thus lowering the latency. From a latency perspective, using a bus or a network is a trade-off between the number of modules connected to the interconnect (which affects access time), the speed of the interconnect, and the network topology.

Data Format: In most modern bus interfaces the data format is defined by separate wire groups for the transaction type, address, write data, read data, and return acknowledgments/errors (e.g., VCI, OCP, AMBA, or CoreConnect). This is used to pipeline transactions. For example, concurrently with sending the address of a read transaction, the data of a previous write transaction can be sent, and the data from an even earlier read transaction can be received. Moreover, having dedicated wire groups simplifies the transaction decoding; there is no need for a mechanism to select between different kinds of data sent over a common set of wires. Inside a network, there is typically no distinction between different kinds of data. Data is treated uniformly, and passed from one router to another. This is done to minimize the control overhead and buffering in routers. If separate wires were used for each of the above-mentioned groups, separate routing, scheduling, and queuing would be needed, increasing the cost of routers.

In addition, in a network at each layer in the protocol stack, control information must be supplied together with the data (e.g., packet type, network address, or packet size). This control information is organized as an envelope around the data. That is, first a header is sent, followed by the actual data (payload), followed possibly by a trailer. Multiple such envelopes may be provided for the same data, each carrying the corresponding control information for each layer in the network protocol stack.
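The envelope structure can be sketched as nested wrapping, one envelope per protocol layer. The concrete format below is purely an assumption for illustration: a 3-byte header (a one-byte layer tag and a two-byte payload length) and a one-byte additive checksum as trailer. The text only states that a header, a payload and possibly a trailer exist.

```python
import struct

HEADER = struct.Struct(">BH")  # hypothetical header: layer tag, payload length


def wrap(layer_tag, payload):
    """Put one envelope (header + payload + trailer) around the payload."""
    header = HEADER.pack(layer_tag, len(payload))
    trailer = bytes([sum(payload) % 256])  # hypothetical one-byte checksum
    return header + payload + trailer


def unwrap(packet):
    """Remove one envelope and return (layer tag, payload)."""
    layer_tag, length = HEADER.unpack(packet[:HEADER.size])
    payload = packet[HEADER.size:HEADER.size + length]
    if packet[HEADER.size + length] != sum(payload) % 256:
        raise ValueError("trailer checksum mismatch")
    return layer_tag, payload
```

Wrapping the same data twice, for example `wrap(1, wrap(2, b"data"))`, gives the multiple envelopes described above: the outer envelope carries the control information of one layer and treats the inner envelope as opaque payload.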

Buffering and Flow Control: Buffering data of a master (output buffering) is used both for buses and NoCs to decouple computation from communication. However, for NoCs output buffering is also needed to marshal data, which consists of (a) (optionally) splitting the outgoing data into smaller packets which are transported by the network, and (b) adding control information for the network around the data (packet header). To avoid output buffer overflow the master must not initiate transactions that generate more data than the currently available space. Similarly to output buffering, input buffering is also used to decouple computation from communication. In a NoC, input buffering is also required to un-marshal data.

In addition, flow control for input buffers differs for buses and NoCs. For buses, the source and destination are directly linked, and the destination can therefore signal directly to a source that it cannot accept data. This information can even be available to the arbiter, such that the bus is not granted to a transaction trying to write to a full buffer.

In a NoC, however, the destination of a transaction cannot signal directly to a source that its input buffer is full. Consequently, transactions to a destination can be started, possibly from multiple sources, after the destination's input buffer has filled up. If an input buffer is full, additional incoming transactions are not accepted, and remain stored in the network. However, this approach can easily lead to network congestion, as the data could eventually be stored all the way back to the sources, blocking the links in between.

To avoid input buffer overflow connections can be used, together with end-to-end flow control. At connection set up between a master and one or more slaves, buffer space is allocated at the network interfaces of the slaves, and the network interface of the master is assigned credits reflecting the amount of buffer space at the slaves. The master can only send data when it has enough credits for the destination slave(s). The slaves grant credits to the master when they consume data.
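The credit mechanism described above can be sketched for a single master and a single slave as follows. The class and method names (`SlaveNI`, `MasterNI`, `send`, `add_credits`) are hypothetical, and the sketch assumes one credit per buffered word and ignores the network delay between the two interfaces.

```python
from collections import deque


class SlaveNI:
    """Slave-side network interface with a fixed amount of input buffer space."""

    def __init__(self, buffer_slots):
        self.capacity = buffer_slots
        self.buffer = deque()

    def accept(self, word):
        if len(self.buffer) >= self.capacity:
            raise OverflowError("input buffer overflow")  # must never happen
        self.buffer.append(word)

    def consume(self):
        """The slave module consumes one word, freeing one buffer slot."""
        return self.buffer.popleft()


class MasterNI:
    """Master-side network interface that sends only while it holds credits."""

    def __init__(self, slave):
        self.slave = slave
        # At connection set-up the credits reflect the slave's buffer space.
        self.credits = slave.capacity

    def send(self, word):
        """Send one word if a credit is available; return whether it was sent."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.slave.accept(word)
        return True

    def add_credits(self, count):
        """Credits granted back by the slave after it consumed data."""
        self.credits += count
```

Because the master never holds more credits than there are free slots at the slave, `accept` cannot overflow, and data never has to wait inside the network for buffer space.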

SUMMARY OF THE INVENTION

It is an object of the invention to provide an integrated circuit and a method for exchanging messages in an integrated circuit with a more effective usage of the properties of the network.

This object is achieved by an integrated circuit according to claim 1 and a method for exchanging messages according to claim 7.

Therefore, an integrated circuit comprising a plurality of processing modules M; I; S; T and a network N; RN arranged for providing at least one connection between a first and at least one second module is provided. Said connection comprises a set of communication channels each having a set of connection properties. Said connection supports transactions comprising outgoing messages from the first module to the second module and return messages from the second module to the first module. The connection properties of the different communication channels of said connection can be adjusted independently.

Therefore, the resources of a network on chip are utilized more efficiently, since the connection between modules can be adapted to their actual requirements such that the connection is not over-dimensioned and unused network resources can be assigned to other connections.

The invention is based on the idea of allowing the communication channels of a connection to have different connection properties.

According to an aspect of the invention, said integrated circuit comprises at least one communication managing means CM for managing the communication between different modules; and at least one resource managing means RM for managing the resources of the network N.

According to a further aspect of the invention, said first module M; I issues a request for a connection to at least one of said second modules to said communication managing means CM. Said communication managing means CM forwards the request for a connection with communication channels each having a specific set of connection properties to said resource managing means RM. Said resource managing means RM determines whether the requested connection based on said communication channels with said specific connection properties is available, and reports the availability of the requested connection to said communication managing means CM. A connection between the first and second module is established based on the available properties or available network resources required to implement the properties of said communication channels of said connection.

According to a still further aspect of the invention, said communication managing means CM rejects establishing a connection based on the available connection properties when the available connection properties are not sufficient to perform the requested connection between said first and second module. The connection properties require network resources to be implemented, e.g. throughput requires slot reservation and flow control requires buffering. Therefore, a connection requiring some properties is opened or not depending on the availability of these resources. Accordingly, the communication manager CM has some control over the minimum requirements for a connection.

According to a further aspect of the invention, said communication managing means CM issues a request to reset the connection between said first and second module, when said modules have successfully performed their transactions, so that the network resources can be used again for other connections.

According to still a further aspect of the invention, said integrated circuit comprises at least one network interface means NI, associated to each of said modules, for managing the communication between said modules and said network N. Hence the modules can be designed independently of the network and can therefore be re-used.

The invention also relates to a method for exchanging messages in an integrated circuit comprising a plurality of modules as described above. The messages between the modules are exchanged over connections via a network. Said connections comprise a set of communication channels each having a set of connection properties. Said connection through the network supports transactions comprising outgoing messages from the first module to the second module and return messages from the second module to the first module. The network manages the outgoing messages in a way different from the return messages, i.e. the communication channels can be configured independently.

Further aspects of the invention are described in the dependent claims.

These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiment(s) described hereinafter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a System on chip according to a first embodiment,

FIG. 2 shows a System on chip according to a second embodiment, and

FIG. 3 shows a System on chip according to a third embodiment.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The following embodiments relate to systems on chip, i.e. a plurality of modules on the same chip communicate with each other via some kind of interconnect. The interconnect is embodied as a network on chip NOC. The network on chip may include wires, bus, time-division multiplexing, switch, and/or routers within a network. At the transport layer of said network, the communication between the modules is performed over connections. A connection is considered as a set of channels, each having a set of connection properties, between a first module and at least one second module. For a connection between a first module and a single second module, the connection comprises two channels, namely one from the first module to the second module, i.e. the request channel, and a second from the second to the first module, i.e. the response channel. The request channel is reserved for data and messages from the first to the second module, while the response channel is reserved for data and messages from the second to the first module. However, if the connection involves one first and N second modules, 2*N channels are provided. The connection properties may include ordering (data transport in order), flow control (a remote buffer is reserved for a connection, and a data producer will be allowed to send data only when it is guaranteed that space is available for the produced data), throughput (a lower bound on throughput is guaranteed), latency (an upper bound for latency is guaranteed), lossiness (dropping of data), transmission termination, transaction completion, data correctness, priority, or data delivery.
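The channel structure described above can be sketched as a small data model: one request channel and one response channel per second module, each with its own independently adjustable property set. The names `Channel` and `make_connection` and the property keys used in the usage note are illustrative assumptions, not terms from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """One unidirectional channel with its own set of connection properties."""
    source: str
    destination: str
    properties: dict = field(default_factory=dict)


def make_connection(first, seconds, request_props, response_props):
    """Build a connection as a list of 2*N channels for N second modules."""
    channels = []
    for second in seconds:
        # Request channel: first module -> second module.
        channels.append(Channel(first, second, dict(request_props)))
        # Response channel: second module -> first module.
        channels.append(Channel(second, first, dict(response_props)))
    return channels
```

For example, `make_connection("M", ["S1", "S2"], {"throughput": 25}, {"throughput": 1})` yields four channels. Each channel holds its own copy of the property dictionary, so changing one channel's properties leaves the others untouched.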

FIG. 1 shows a System on chip according to a first embodiment. The system comprises a master module M and two slave modules S1, S2. Each module is connected to a network N via a respective network interface NI. The network interfaces NI are used as interfaces between the master and slave modules M, S1, S2 and the network N. The network interfaces NI are provided to manage the communication between the respective modules and the network N, so that the modules can perform their dedicated operation without having to deal with communication with the network or other modules. The network interfaces NI can send requests such as read rd and write wr between each other over the network.

FIG. 2 shows a system on chip according to a second embodiment. The system comprises two modules, namely an initiator I and a target T, a router network RN, and two network interfaces ANIP, PNIP between the modules and the router network RN. The network interfaces provide two network interface ports NIP (one request and one response port) through which the modules communicate with the router network RN or other modules via the router network RN. The network interface has one or more ports where modules can be connected. Two different types of ports are available, namely the active network interface port ANIP, which is connected to masters, and the passive network interface port PNIP, which is connected to slaves. The communication between the initiator module I and the target module T is based on request-response transactions, where the master, i.e. the initiator module I, initiates a transaction by placing a request, possibly with some data or required connection properties. The request REQ is delivered to the slave, i.e. the target module T, via the active network interface port ANIP, the network RN and the passive network interface port PNIP. The request is executed by the target module T and data is returned as a response RESP if necessary or required. This response RESP may include data and/or acknowledgement for the master or initiator module I.

The modules as described in FIGS. 1 and 2 can be so-called intellectual property blocks IPs (computation elements, memories or a subsystem which may internally contain interconnect modules) that interact with the network at said network interfaces NI. NIs provide NI ports NIP through which the communication services are accessed. A NI can have several NIPs to which one or more IPs can be connected. Similarly, an IP can be connected to more than one NI and NIP.

The communication over the network is performed by the network interfaces on connections. Connections are introduced to describe and identify communication with different properties, such as guaranteed throughput, bounded latency and jitter, ordered delivery, or flow control. For example, to distinguish and independently guarantee communication of 1 Mb/s and 25 Mb/s, two connections can be used. Two NIPs can be connected by multiple connections, possibly with different properties. Connections as defined here are similar to the concept of threads and connections from OCP and VCI. Whereas in OCP and VCI connections are used only to relax transaction ordering, we generalize from only the ordering property to include, among others, configuration of buffering and flow control, guaranteed throughput, and bounded latency per connection.

FIG. 3 shows a system on chip according to a third embodiment. The system of the third embodiment is based on the system according to the second embodiment and additionally comprises a communication manager CM and a resource manager RM. Here, the communication manager CM and the resource manager RM are arranged in the router network RN, but they can also be arranged in one or some of the network interfaces NI.

If the initiator module I needs to read data from or write data to the target module T, a request REQ for a connection with the target module T is issued. This request REQ is sent to a request port of the active network interface port ANIP of the network interface associated with said initiator I. This request can contain information identifying at least one target module, i.e. a target ID, as well as information regarding the properties of the connection between the initiator module I and the target module T. The properties of the connection between the two modules may depend on the direction, i.e. the properties of the request channel can be different from the properties of the response channel.

After receiving the request REQ from the active network interface port ANIP, the communication manager CM requests a connection with a set of properties between two modules from the resource manager RM, i.e. the properties of a connection, e.g. throughput and flow control, are to be requested when asking for a connection setup. The resource manager RM allocates the necessary resources and enquires whether a connection based on these resources is possible. The properties require resources to be implemented (e.g., throughput requires slot reservations, flow control requires buffering). Therefore, a connection requiring some properties is opened or not depending on the availability of these resources. The availability of the connection properties corresponds to the ability of the network to provide the resources for the connection properties identified in the connection setup request. The allocation of the resources can be performed in two ways. First, the resource manager RM may contain a property table with entries for all the different properties of the channels and connections. Alternatively, one central property table can be provided containing all different properties of the network, i.e. the table can either be central or distributed.

After enquiring about the available connection properties, the resource manager RM reserves the available connection, i.e. the required resources, and reports to the communication manager CM which connection is available, i.e. which connection properties or the required resources thereof are available for the desired channels. The communication manager CM can accept the connection with the available properties, but it may also refuse the offered connection if the available properties or the resources thereof are not acceptable. If the communication manager CM refuses the offered connection, it sends an error message to the initiator module I (i.e. via the ANIP) and requests the resource manager RM to release the reservation of said connection. Otherwise, the resource manager RM sets the connection properties and establishes a connection with the accepted properties between said initiator and target modules. After said two modules I, T have performed the transactions as requested by the initiator module, the communication manager CM issues a request to said resource manager RM to reset the connection or the connection properties.
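
The setup sequence described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the embodiment: the class names, the `ChannelProps` fields and the all-or-nothing reservation policy are assumptions made for the example (the text also allows the communication manager to accept a connection with reduced properties, which the sketch omits).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelProps:
    """Hypothetical per-channel properties (names are assumptions)."""
    slots: int         # guaranteed-throughput slots; 0 means best effort
    buffer_words: int  # buffer space reserved for this channel


class ResourceManager:
    """Tracks free network resources and reserves them per connection."""

    def __init__(self, free_slots, free_buffer_words):
        self.free_slots = free_slots
        self.free_buffer_words = free_buffer_words

    def reserve(self, request, response):
        """Reserve resources for both channels, or return None if they do not fit."""
        slots = request.slots + response.slots
        words = request.buffer_words + response.buffer_words
        if slots > self.free_slots or words > self.free_buffer_words:
            return None
        self.free_slots -= slots
        self.free_buffer_words -= words
        return (request, response)

    def release(self, reservation):
        """Free the resources held by a closed or refused connection."""
        request, response = reservation
        self.free_slots += request.slots + response.slots
        self.free_buffer_words += request.buffer_words + response.buffer_words


class CommunicationManager:
    """Forwards connection requests from an ANIP to the resource manager."""

    def __init__(self, rm):
        self.rm = rm

    def open(self, request, response):
        reservation = self.rm.reserve(request, response)
        if reservation is None:
            # Corresponds to the error message sent back to the initiator.
            raise RuntimeError("connection refused: resources unavailable")
        return reservation

    def close(self, reservation):
        self.rm.release(reservation)
```

Note that the request and response channels carry separate `ChannelProps`, so a best-effort request channel (zero slots) can be paired with a guaranteed-throughput response channel.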

The connections according to the embodiments of the invention must first be created or established with the desired properties before being used. This may result in resource reservations inside the network (e.g., buffer space, or percentage of the link usage per time unit). If the requested resources are not available, the network RN will refuse the request. After usage, connections are closed, which leads to freeing the resources occupied by that connection.

To allow more flexibility in configuring connections, and, hence, better resource allocation per connection, the outgoing and return parts of connections can be configured independently. For example, a different amount of buffer space can be allocated in the NIPs at the master and slaves, or different bandwidths can be reserved for requests and responses.

An example of the use of different properties for the outgoing and return parts is described as follows. Guaranteed-throughput connections can overbook resources in some cases. For example, when an ANIP opens a guaranteed-throughput read connection, it must reserve slots for the read command messages and for the read data messages. The ratio between the two can be very large (e.g., 1:100), which leads either to a large number of slots, or to bandwidth being wasted for the read command messages.

To solve this problem, the connection properties of the request and response parts of a connection can be configured independently for all of throughput, latency and jitter. Consequently, the request part of a connection can be best effort, while the response part can have guaranteed throughput (or vice versa). For the example mentioned above, we can use best-effort read command messages and guaranteed-throughput read-data messages. No global connection guarantees can be offered in this case, but the overall throughput can be higher and more stable than in the case of using only best-effort traffic.

A connection on which only read commands are executed for large blocks of data can be considered as a further example. In such a case, if flow control is implemented, a network interface PNIP associated with a slave S, T would require a small buffer, while the network interface ANIP associated with a master M, I would require a large buffer. In an alternative example, a guaranteed throughput is required. Typically, guaranteed throughput is provided by reserving fixed-size slots from a slot table, where a bandwidth value is associated with each slot, so that the bandwidth values are summed if more slots are allocated.

In such a scheme there will be a minimum bandwidth that can be allocated. Therefore, allocating bandwidth only for read commands would be an inefficient use of the bandwidth, as read commands would only use a small fraction of it. The return data responses would probably use enough bandwidth to justify the slot reservation. To prevent such inefficient bandwidth utilization, the request part of the connection can be set up as best effort (i.e. no throughput guarantees), with guaranteed throughput only for the response part. Accordingly, no overall guarantees will be offered, but for a reasonably loaded network such a connection will still perform well. For a network with a low load this solution may be even faster, as best-effort data can be transferred faster (e.g. there is no need to wait for the assigned bandwidth slot).

Depending on the requested services, the time to handle a connection (i.e., creating, closing, modifying services) can be short (e.g., creating/closing an unordered, lossy, best-effort connection) or significant (e.g., creating/closing a multicast guaranteed-throughput connection). Consequently, connections are assumed to be created, closed, or modified infrequently, coinciding e.g. with reconfiguration points, when the application requirements change.

Communication takes place on connections using transactions, each consisting of a request and possibly a response. The request encodes an operation (e.g., read, write, flush, test-and-set, nop) and possibly carries outgoing data (e.g., for write commands). The response returns data as a result of a command (e.g., read) and/or an acknowledgment. Connections involve at least two NIPs. Transactions on a connection are always started at one and only one of the NIPs, called the connection's active NIP (ANIP). All the other NIPs of the connection are called passive NIPs (PNIP).

There can be multiple transactions active on a connection at a time, as for split buses but more generally. That is, transactions can be started at the ANIP of a connection while responses for earlier transactions are pending. If a connection has multiple slaves, multiple transactions can be initiated towards different slaves. Transactions are also pipelined between a single master-slave pair for both requests and responses. In principle, transactions can also be pipelined within a slave, if the slave allows this.

A transaction can be composed from the following messages:

A command message (CMD) is sent by the ANIP, and describes the action to be executed at the slave connected to the PNIP. Examples of commands are read, write, test-and-set, and flush. Commands are the only messages that are compulsory in a transaction. For NIPs that allow only a single command with no parameters (e.g., fixed-size address-less write), we assume the command message still exists, even if it is implicit (i.e., not explicitly sent by the IP).

An out data message (OUTDATA) is sent by the ANIP following a command that requires data to be executed (e.g., write, multicast, and test-and-set).

A return data message (RETDATA) is sent by a PNIP as a consequence of a transaction execution that produces data (e.g., read, and test-and-set).

A completion acknowledgment message (RETSTAT) is an optional message which is returned by the PNIP when a command has been completed. It may signal either a successful completion or an error. For transactions including both RETDATA and RETSTAT, the two messages can be combined in a single message for efficiency. Conceptually, however, they both exist: RETSTAT to signal the presence of data or an error, and RETDATA to carry the data. In bus-based interfaces RETDATA and RETSTAT typically exist as two separate signals.

Messages composing a transaction are divided into outgoing messages, namely CMD and OUTDATA, and response messages, namely RETDATA and RETSTAT. Within a transaction, CMD precedes all other messages, and RETDATA precedes RETSTAT if present. These rules apply both between master and ANIP, and between PNIP and slave.
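
The ordering rules for the messages of a single transaction can be expressed as a small check. The following Python sketch is illustrative only; the function name is hypothetical, and it additionally assumes that each message type occurs at most once per transaction, which the description does not state.

```python
OUTGOING = {"CMD", "OUTDATA"}      # sent by the ANIP
RESPONSE = {"RETDATA", "RETSTAT"}  # sent by a PNIP


def is_valid_transaction(messages):
    """Check the message-ordering rules for one transaction.

    `messages` is the list of message type names in the order observed.
    """
    # CMD is the only compulsory message and precedes all others.
    if not messages or messages[0] != "CMD":
        return False
    # Only the four message types exist.
    if any(m not in OUTGOING | RESPONSE for m in messages):
        return False
    # Assumption of this sketch: each message type appears at most once.
    if len(set(messages)) != len(messages):
        return False
    # RETDATA precedes RETSTAT when both are present.
    if "RETDATA" in messages and "RETSTAT" in messages:
        return messages.index("RETDATA") < messages.index("RETSTAT")
    return True
```

For instance, a posted write is `["CMD", "OUTDATA"]`, and an acknowledged read is `["CMD", "RETDATA", "RETSTAT"]`.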

Connections can be classified as follows:

A simple connection is a connection between one ANIP and one PNIP.

A narrowcast connection is a connection between one ANIP and one or more PNIPs, in which each transaction that the ANIP initiates is executed by exactly one PNIP. An example of a narrowcast connection is one where the ANIP performs transactions on an address space which is mapped onto two memory modules. Depending on the transaction address, a transaction is executed on only one of these two memories.

A multicast connection is a connection between one ANIP and one or more PNIPs, in which the sent messages are duplicated and each PNIP receives a copy of those messages. In a multicast connection no return messages are currently allowed, because of the large traffic they generate (i.e., one response per destination). Return messages would also increase the complexity in the ANIP, because individual responses from PNIPs would have to be merged into a single response for the ANIP. This requires buffer space and/or additional computation for the merging itself.
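
The difference between narrowcast and multicast delivery can be illustrated as follows. This Python sketch is a simplified model under stated assumptions: the function names, the PNIP labels and the address map layout (base address and size per PNIP) are hypothetical and chosen only for the example.

```python
def narrowcast_target(address, address_map):
    """Select the single PNIP that executes a transaction.

    `address_map` maps a PNIP name to a (base, size) address range.
    """
    for pnip, (base, size) in address_map.items():
        if base <= address < base + size:
            return pnip
    raise ValueError(f"address {address:#x} is not mapped to any PNIP")


def multicast_copies(message, pnips):
    """Duplicate an outgoing message so that every PNIP receives a copy."""
    return {pnip: message for pnip in pnips}


# Hypothetical address space mapped onto two memory modules.
ADDRESS_MAP = {
    "PNIP0": (0x0000, 0x1000),
    "PNIP1": (0x1000, 0x1000),
}
```

With this map, a transaction to address `0x0800` is executed only at `PNIP0`, whereas a multicast message is delivered to both PNIPs.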

Connection properties that can be configured for a connection are as follows: guaranteed message integrity, guaranteed transaction completion, various transaction orderings, guaranteed throughput, bounded latency and jitter, and connection flow control.

Data integrity means that the content of messages is not changed (accidentally or not) during transport. We assume that data integrity is already solved at a lower layer in our network, namely at the link layer, because in current on-chip technologies data can be transported uncorrupted over links. Consequently, our network interface always guarantees that messages are delivered uncorrupted at the destination.

A transaction without a response (e.g. a posted write) is said to be complete when it has been executed by the slave. As there is no response message to the master, no guarantee regarding transaction completion can be given.

A transaction with a response (e.g. an acknowledged write) is said to be complete when a RETSTAT message is received from the ANIP. Recall that when data is received as a response (RETDATA), a RETSTAT (possibly implicit) is also received to validate the data. The transaction may be executed successfully, in which case a success RETSTAT is returned; it may fail in its execution at the slave, in which case an execution error RETSTAT is returned; or it may fail because of buffer overflow in a connection with no flow control, in which case an overflow error is reported. We assume that when a slave accepts a CMD requesting a response, the slave always generates the response.

In the network, routers do not drop data; therefore, messages are always guaranteed to be delivered at the NI. For connections with flow control, NIs do not drop data either. Thus, message delivery and, hence, transaction completion to the IPs is guaranteed automatically in this case.

However, if there is no flow control, messages may be dropped at the network interface in case of buffer overflow. All of CMD, OUTDATA, and RETDATA may be dropped at the NI. To guarantee transaction completion, RETSTAT is not allowed to be dropped. Consequently, in the ANIPs enough buffer space must be provided to accommodate RETSTAT messages for all outstanding transactions. This is enforced by bounding the number of outstanding transactions.

Now the ordering requirements between different transactions within a single connection are described. Over different connections no ordering of transactions is defined at the transport layer.

There are several points in a connection where the order of transactions can be observed: (a) the order in which the master module M, I presents CMD messages to the ANIP, (b) the order in which the CMDs are delivered to the slave module T, S by the PNIP, (c) the order in which the slave module T, S presents the responses to the PNIP, and (d) the order in which the responses are delivered to the master by the ANIP. Note that not all of (b), (c), and (d) are always present. Moreover, there are no assumptions about the order in which the slaves execute transactions; only the order of the responses can be observed. The order of the transaction execution by the slaves is considered to be a system decision, and not a part of the interconnect protocol.

At both ANIP and PNIPs, outgoing messages belonging to different transactions on the same connection are allowed to be interleaved. For example, two write commands can be issued, and only afterwards their data. If the order of OUTDATA messages differs from the order of CMD messages, transaction identifiers must be introduced to associate OUTDATAs with their corresponding CMD.

Outgoing messages can be delivered by the PNIPs to the slaves (see b) as follows:

Unordered, which imposes no order on the delivery of the outgoing messages of different transactions at the PNIPs.

Ordered locally, where transactions must be delivered to each PNIP in the order they were sent (a), but no order is imposed across PNIPs. Locally-ordered delivery of the outgoing messages can be provided either by an ordered data transportation, or by reordering outgoing messages at the PNIP.

Ordered globally, where transactions must be delivered in the order they were sent, across all PNIPs of the connection. Globally-ordered delivery of the outgoing part of transactions requires a costly synchronization mechanism.

Transaction response messages can be delivered by the slaves to the PNIPs (c) as Ordered, when RETDATA and RETSTAT messages are returned in the same order as the CMDs were delivered to the slave (b), or as Unordered, otherwise. When responses are unordered, there has to be a mechanism to identify the transaction to which a response belongs. This is usually done using tags attached to messages for transaction identification (similar to tags in VCI).

Response messages can be delivered by the ANIP to the master (see d) as follows:

Unordered, which imposes no order on the delivery of responses. Here, also, tags must be used to associate responses with their corresponding CMDs.

Ordered locally, where RETDATA and RETSTAT messages of transactions for a single slave are delivered in the order the original CMDs were presented by the master to the ANIP. Note that there is no ordering imposed for transactions to different slaves within the same connection.

Globally ordered, where all responses in a connection are delivered to the master in the same order as the original CMDs. When transactions are pipelined on a connection, then globally-ordered delivery of responses requires reordering at the ANIP.

All 3×2×3=18 combinations of the above orderings are possible. Out of these, we define and offer the following two. An unordered connection is a connection in which no ordering is assumed in any part of the transactions. As a result, the responses must be tagged to be able to identify the transaction to which they belong. Implementing unordered connections has low cost; however, they may be harder to use, and they introduce the overhead of tagging.

An ordered connection is defined as a connection with local ordering for the outgoing messages from PNIPs to slaves, ordered responses at the PNIPs, and global ordering for responses at the ANIP. We choose local ordering for the outgoing part because global ordering has too high a cost and has few uses. The ordering of responses is selected to allow a simple programming model with no tagging. Global ordering at the ANIP is possible at a moderate cost, because all the reordering is done locally in the ANIP. A user can emulate connections with global ordering of outgoing and return messages at the PNIPs using non-pipelined acknowledged transactions, at the cost of high latency.

In the network, throughput can be reserved for connections in a time-division multiple access (TDMA) fashion, where bandwidth is split into fixed-size slots on a fixed time frame. Bandwidth, as well as bounds on latency and jitter, can be guaranteed when slots are reserved. They are all defined in multiples of the slots.

As mentioned earlier, the network guarantees that messages are delivered to the NI. Messages sent from one of the NIPs are not immediately visible at the other NIP, because of the multi-hop nature of networks. Consequently, handshakes over a network would allow only a single message to be transmitted at a time. This limits the throughput on a connection and adds latency to transactions. To solve this problem, and to achieve a better network utilization, the messages must be pipelined. In this case, if the data is not consumed at the PNIP at the same rate at which it arrives, either flow control must be introduced to slow down the producer, or data may be lost because of limited buffer space at the consumer NI.

End-to-end flow control may be introduced at the level of connections, which requires buffer space to be associated with connections. End-to-end flow control ensures that messages are sent over the network only when there is enough space in the NIP's destination buffer to accommodate them.

End-to-end flow control is optional (i.e., to be requested when connections are opened) and can be configured independently for the outgoing and return paths. When no flow control is provided, messages are dropped when buffers overflow. Multiple policies of dropping messages are possible, as in off-chip networks. Possible scenarios include: (a) the oldest message is dropped (milk policy), or (b) the newest message is dropped (wine policy).
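
The two drop policies can be illustrated with a bounded buffer. The Python sketch below is an assumption-laden illustration: the function name `enqueue` and its signature are hypothetical, and "newest message" is taken to mean the message that arrives while the buffer is full.

```python
from collections import deque


def enqueue(buffer, message, capacity, policy):
    """Add a message to a bounded buffer without flow control.

    Returns the dropped message, or None if nothing was dropped.
    """
    if len(buffer) < capacity:
        buffer.append(message)
        return None
    if policy == "milk":
        # Milk policy: the oldest buffered message is dropped.
        dropped = buffer.popleft()
        buffer.append(message)
        return dropped
    if policy == "wine":
        # Wine policy: the newest (arriving) message is dropped.
        return message
    raise ValueError(f"unknown drop policy: {policy}")
```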

One example of flow control is credit-based flow control. Credits are associated with the empty buffer space at the receiver NI. The sender's credit is lowered as data is sent. When the PNIP delivers data to the slave, credits are granted to the sender. If the sender's credit is not sufficient to send some data, the NI at the sender stalls the sending.
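
The credit mechanism can be modelled from the sender's point of view as follows. This is a minimal Python sketch; the class name, the method names and the choice of counting credits in buffer words are assumptions made for illustration.

```python
class CreditSender:
    """Sender-side credit counter for one flow-controlled channel."""

    def __init__(self, receiver_buffer_words):
        # Initially the whole receiver buffer is empty, so all of it is credit.
        self.credits = receiver_buffer_words

    def try_send(self, words):
        """Send `words` if credit allows; return False to indicate a stall."""
        if words > self.credits:
            return False
        self.credits -= words  # credit is lowered as data is sent
        return True

    def grant(self, words):
        """Called when the receiver has delivered `words` to its module."""
        self.credits += words
```

A sender with 8 words of credit can send 6 words, must then stall a 4-word message, and can resume once the receiver has consumed data and granted credits back.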

To illustrate the need for differentiated services on connections, some examples of traffic are described below. Video processing streams typically require a lossless, in-order video stream with guaranteed throughput, but possibly allow corrupted samples. A connection for such a stream would require the necessary throughput, ordered transactions, and flow control. If the video stream is produced by the master, only write transactions are necessary. In such a case, with a flow-controlled connection there is no need to also require transaction completion, because messages are never dropped, and the write command and its data are always delivered at the destination. Data integrity is always provided by our network, even though it may not be necessary in this case.

Another example is that of cache updates, which require uncorrupted, lossless, low-latency data transfer, but for which ordering and guaranteed throughput are less important. In such a case, a connection would not require any time-related guarantees, because a low latency, even if preferable, is not critical. Low latency can be obtained even with a best-effort connection. The connection would also require flow control and guaranteed transaction completion to ensure lossless transactions. However, no ordering is necessary, because this is not important for cache updates, and allowing out-of-order transactions can reduce the response time.

A set of NoC services is defined that abstract from the network details. Using these services in IP design decouples computation and communication. A request-response transaction model is used to be close to existing on-chip interconnect protocols. This eases the migration of current IPs to NoCs. To fully utilize the NoC capabilities, such as high bandwidth and transaction concurrency, connection-oriented communication is provided. Connections can be configured independently with different properties. These properties include transaction completion, various transaction orderings, bandwidth lower bounds, latency and jitter upper bounds, and flow control.

The provision of such network on chip NoC services is a prerequisite for service-based system design which makes applications independent of NoC implementations, makes designs more robust, and enables architecture-independent quality-of-service strategies.

In other situations there may be different types of flow control (e.g. you never want to lose write commands, but don't mind losing read data). If a module can do both read and write commands, it may be important that write transactions always succeed (e.g. when writing to an interrupt controller), but that read transactions are not critical because they can be retried (so the CMD of the read transaction is dropped and the read never executed, or the RETDATA is dropped after the read has been executed). Another example is that if you know that writes always succeed if they are delivered, a flow-controlled connection is requested. Acknowledgements are not necessary in that case. Without flow control, acknowledgements are compulsory, complicating the master and causing additional traffic.

In the integrated circuit according to the invention, the decision whether to drop messages is not taken per transaction but for the outgoing and return parts of a connection as a whole. For example, all outgoing messages (having the format read+address or write+address+data) may be guaranteed lossless, while for all return messages (whether read data or write acknowledgements) packets may be dropped.

A connection could be opened as follows: connid = open(nofc/fc, outgoing unordered/local/global, outgoing buffer size, return unordered/local/global, return buffer size); i.e. all outgoing messages have certain properties, and all return messages have certain properties. Here, fc represents flow control and nofc represents no flow control.

An alternative solution for deadlock in NoCs, which takes into consideration that modules connecting to the network are either masters (initiating requests and receiving responses), or slaves (receiving requests and sending back responses), is to maintain separate virtual networks (with separate buffers) for requests and responses.

According to an embodiment of the invention, a method for exchanging messages in an integrated circuit comprising a plurality of modules requests a connection with specific properties between two modules, decides whether said requested connection with said properties between said two modules is possible, responds with the available properties of the connection, establishes a connection with said properties between said two modules, and performs transactions between said two modules. Additionally, the available connection may be accepted and the properties of said connection may be reset.

According to a further embodiment of the invention, the network has a first mode wherein a message is transferred within a guaranteed time interval, and a second mode wherein a message is transferred as fast as possible with the available resources, wherein the outgoing transaction is a read message, requesting the second module to send data to the first module, wherein the return transaction is the data generated by the second module upon this request, and wherein the outgoing transaction is transferred according to the second mode, and the return transaction is transferred according to the first mode. For an acknowledged write transaction, the write command and the outgoing data from the master use guaranteed throughput, and the acknowledgment from the slave uses best effort. Apart from time-related guarantees, there is also a distinction in the buffering in the above examples. For data messages there is potentially more buffering allocated than for commands and acknowledgments. Consequently, for a read transaction the buffers for the return part would be larger than those for the outgoing part. For the acknowledged write, the buffers for the outgoing part are larger, and those for acknowledgments are smaller.

According to a further embodiment, it is possible to allocate different bandwidths for different channels. However, there are also limitations. A slot table is used, which contains a number of slots in a time window. Bandwidth is reserved by allocating these slots to connections. For example, if a table with 100 slots for a time frame of 1 μs is used, each slot will be allocated for 1/100 of 1 μs = 10 ns. If the network provides 1 Gb/s per link, the bandwidth per slot will be 1/100 of 1 Gb/s = 10 Mb/s. Therefore, only multiples of 10 Mb/s can be allocated for guaranteed-throughput traffic. For a read command generating long bursts, allocating the minimum bandwidth of 10 Mb/s would probably be too much, as the command will use only a small fraction of it. The unused bandwidth can indeed be used by best-effort traffic, but not by other guaranteed-throughput traffic. As a result, not all the traffic for which guarantees are needed may fit in the slot table. An alternative is to use more slots, but this increases the cost of the router. Accordingly, a best-effort command may be a better solution.
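
The slot-table arithmetic of this example can be reproduced with a short Python sketch. The function names are hypothetical; the numbers (100 slots, a 1 μs frame, 1 Gb/s per link) are those of the example above, and the 25 Mb/s and 0.1 Mb/s requirements in the usage note are illustrative values only.

```python
import math


def slot_duration_ns(frame_ns, num_slots):
    """Duration of one slot in a TDMA frame."""
    return frame_ns / num_slots


def slot_bandwidth_mbps(link_mbps, num_slots):
    """Bandwidth granted by one reserved slot."""
    return link_mbps / num_slots


def slots_needed(required_mbps, link_mbps, num_slots):
    """Smallest number of slots that covers the required bandwidth."""
    per_slot = slot_bandwidth_mbps(link_mbps, num_slots)
    return math.ceil(required_mbps / per_slot)
```

With 100 slots on a 1000 ns frame and a 1000 Mb/s link, a slot lasts 10 ns and carries 10 Mb/s. A channel needing 25 Mb/s must reserve 3 slots (30 Mb/s), and a command channel needing only 0.1 Mb/s would still occupy a full 10 Mb/s slot, which is the inefficiency that motivates a best-effort request channel.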

According to still a further embodiment of the invention, a connection between a master and two slaves is provided, wherein different properties are assigned to the different channels from the master to the slaves. One of these slaves is a fast memory and the other one is a slow memory. A higher throughput is assigned to the channel connecting the master and the fast memory.

As described above, NoCs have different properties from both existing off-chip networks and existing on-chip interconnects. As a result, existing protocols and service interfaces cannot be adopted directly for NoCs, but must be adapted to take the characteristics of NoCs into account. For example, a protocol such as TCP/IP assumes the network is lossy, and includes significant complexity to provide reliable communication. Therefore, it is not suitable in a NoC where we assume data transfer reliability is already solved at a lower level. On the other hand, existing on-chip protocols such as VCI, OCP, AMBA, or CoreConnect are also not directly applicable. For example, they assume ordered transport of data: if two requests are initiated from the same master, they will arrive in the same order at the destination. This does not hold automatically for NoCs. Atomic chains of transactions and end-to-end flow control also need special attention in a NoC interface.

The objectives when defining network services are the following. First, the services should abstract from the network internals as much as possible. This is a key ingredient in tackling the challenge of decoupling the computation from communication, which allows IPs (the computation part) and the interconnect (the communication part) to be designed independently from each other. As a consequence, the services are positioned at the transport layer in the OSI reference model, which is the first layer to be independent of the implementation of the network. Second, a NoC interface should be as close as possible to a bus interface. NoCs can then be introduced non-disruptively: with minor changes, existing IPs, methodologies and tools can continue to be used. As a consequence, a request-response interface is used, similar to interfaces for split buses. Third, the interface extends traditional bus interfaces to fully exploit the power of NoCs. For example, connection-based communication not only relaxes ordering constraints (as for buses), but also enables new communication properties, such as end-to-end flow control based on credits, or guaranteed throughput. All these properties can be set for each connection individually.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements. In the device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Furthermore, any reference signs in the claims shall not be construed as limiting the scope of the claims.

1. Integrated circuit comprising a plurality of processing modules (M;I; S; T) and a network (N; RN) arranged for providing at least oneconnection between a first and at least one second module, wherein theat least one connection comprises a set of communication channels eachhaving a set of connection properties, the connection properties of thedifferent communication channels of said connection being adjustableindependently, wherein said connection supports transactions comprisingoutgoing messages from the first module to the second module and/orreturn messages from the second module to the first module. 2.Integrated circuit according to claim 1, further comprising: at leastone communication managing means (CM) for managing the communicationbetween different modules; and at least one resource managing means (RM)for managing the resources of the network (N).
3. Integrated circuit according to claim 2, wherein said first module (M; I) is adapted to issue a request (REQ) for a connection with at least one of said second modules to said communication managing means (CM), said communication managing means (CM) is adapted to forward said request (REQ) for a connection with communication channels each having a specific set of connection properties to said resource managing means (RM), said resource managing means (RM) is adapted to determine whether the requested connection based on said communication channels with said specific connection properties is available, and to report the availability of the requested connection to said communication managing means (CM), wherein a connection between the first and second module is established based on the available properties of said communication channels of said connection.
4. Integrated circuit according to claim 2, wherein said communication managing means (CM) is adapted to reject establishing a connection based on the available connection properties when the available connection properties are not sufficient to perform the requested connection between said first and second module (M, I, S, T).
5. Integrated circuit according to claim 2, wherein said communication managing means (CM) is adapted to request a reset of the connection between said first and second module (M, I, S, T), when said modules have successfully performed their transactions.
6. Integrated circuit according to claim 2, further comprising: at least one network interface means (NI), associated to each of said modules, for managing the communication between said modules and said network (N).
7. Method for exchanging messages in an integrated circuit comprising a plurality of modules, the messages between the modules being exchanged over connections via a network, wherein said connections comprise a set of communication channels each having a set of connection properties, any communication channel being independently configurable, wherein said connection through the network supports transactions comprising outgoing messages from the first module to the second module and/or return messages from the second module to the first module.